Classification and Regression Tree (CART) Techniques
ثبت نشده
چکیده
In certain research studies development of a reliable decision rule, which can be used to classify new observations into some predefined categories, plays an important role. The existing traditional statistical methods are inappropriate to use in certain specific situations, or of limited utility, in addressing these types of classification problems. There are a number of reasons for these difficulties. First, there are generally many possible “predictor” variables which makes the task of variable selection difficult. Traditional statistical methods are poorly suited for this sort of multiple comparisons. Second, the predictor variables are rarely nicely distributed. Many variables (in agriculture and other real life situations) are not normally distributed and different groups of subjects may have markedly different degrees of variation or variance. Third, complex interactions or patterns may exist in the data. For example, the value of one variable (e.g., age) may substantially affect the importance of another variable (e.g., weight). These types of interactions are generally difficult to model and virtually impossible to model when the number of interactions and variables becomes substantial. Fourth, the results of traditional methods may be difficult to use. For example, a multivariate logistic regression model yields a probability for different classes of the dependent variable, which can be calculated using the regression coefficients and the values of the explanatory variable. But practitioners generally do not think in terms of probability but, rather in terms of categories, such as “presence” versus “absence.” Regardless of the statistical methodology being used, the creation of a decision rule requires a relatively large dataset.
منابع مشابه
Forest Stand Types Classification Using Tree-Based Algorithms and SPOT-HRG Data
Forest types mapping, is one of the most necessary elements in the forest management and silviculture treatments. Traditional methods such as field surveys are almost time-consuming and cost-intensive. Improvements in remote sensing data sources and classification –estimation methods are preparing new opportunities for obtaining more accurate forest biophysical attributes maps. This research co...
متن کاملExperimental Evaluation of Algorithmic Effort Estimation Models using Projects Clustering
One of the most important aspects of software project management is the estimation of cost and time required for running information system. Therefore, software managers try to carry estimation based on behavior, properties, and project restrictions. Software cost estimation refers to the process of development requirement prediction of software system. Various kinds of effort estimation patter...
متن کاملComparing the Results of Logistic Regression Model and Classification and Regression Tree Analysis in Determining Prognostic Factors for Coronary Artery Disease in Mashhad, Iran
Background and purpose: Understanding of the risk factors for cardiovascular artery disease, which is the leading cause of death worldwide, can lead to essential changes in its etiology, prevalence, and treatment. The aim of this study was to compare the results of logistic regression model and Classification and Regression Tree Analysis (CART) in determining the prognostic factors for coronary...
متن کاملPrediction of melting points of a diverse chemical set using fuzzy regression tree
The classification and regression trees (CART) possess the advantage of being able to handlelarge data sets and yield readily interpretable models. In spite to these advantages, they are alsorecognized as highly unstable classifiers with respect to minor perturbations in the training data.In the other words methods present high variance. Fuzzy logic brings in an improvement in theseaspects due ...
متن کاملFactors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis
Background: Due to the importance of medical studies, researchers of this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be as complementary to regression models. We compared the performance of a logistic regression model and a CART in predi...
متن کاملUnderstanding the Determinants of Poverty
Regression analysis is commonly undertaken to identify the effects of each of these characteristics on income (or expenditure) per capita. Attention is needed to choose the independent variables carefully, to be sure that they are indeed exogenous. A number of more exotic techniques are now available for this purpose, including classification and regression tree (CART) models and multiple-adapt...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011